Fault-Tolerant Entity Resolution with the Crowd
نویسندگان
چکیده
In recent years, crowdsourcing is increasingly applied as a means to enhance data quality. Although the crowd generates insightful information especially for complex problems such as entity resolution (ER), the output quality of crowd workers is often noisy. That is, workers may unintentionally generate false or contradicting data even for simple tasks. The challenge that we address in this paper is how to minimize the cost for task requesters while maximizing ER result quality under the assumption of unreliable input from the crowd. For that purpose, we first establish how to deduce a consistent ER solution from noisy worker answers as part of the data interpretation problem. We then focus on the next-crowdsource problem which is to find the next task that maximizes the information gain of the ER result for the minimal additional cost. We compare our robust data interpretation strategies to alternative state-of-the-art approaches that do not incorporate the notion of fault-tolerance, i.e., the robustness to noise. In our experimental evaluation we show that our approaches yield a quality improvement of at least 20% for two real-world datasets. Furthermore, we examine task-to-worker assignment strategies as well as task parallelization techniques in terms of their cost and quality trade-offs in this paper. Based on both synthetic and crowdsourced datasets, we then draw conclusions on how to minimize cost while maintaining high quality ER results.
منابع مشابه
A New Fault Tolerant Nonlinear Model Predictive Controller Incorporating an UKF-Based Centralized Measurement Fusion Scheme
A new Fault Tolerant Controller (FTC) has been presented in this research by integrating a Fault Detection and Diagnosis (FDD) mechanism in a nonlinear model predictive controller framework. The proposed FDD utilizes a Multi-Sensor Data Fusion (MSDF) methodology to enhance its reliability and estimation accuracy. An augmented state-vector model is developed to incorporate the occurred senso...
متن کاملFault tolerant system with imperfect coverage, reboot and server vacation
This study is concerned with the performance modeling of a fault tolerant system consisting of operating units supported by a combination of warm and cold spares. The on-line as well as warm standby units are subject to failures and are send for the repair to a repair facility having single repairman which is prone to failure. If the failed unit is not detected, the system enters into an unsafe...
متن کاملNovel efficient fault-tolerant full-adder for quantum-dot cellular automata
Quantum-dot cellular automata (QCA) are an emerging technology and a possible alternative for semiconductor transistor based technologies. A novel fault-tolerant QCA full-adder cell is proposed: This component is simple in structure and suitable for designing fault-tolerant QCA circuits. The redundant version of QCA full-adder cell is powerful in terms of implementing robust digital functions. ...
متن کاملNovel efficient fault-tolerant full-adder for quantum-dot cellular automata
Quantum-dot cellular automata (QCA) are an emerging technology and a possible alternative for semiconductor transistor based technologies. A novel fault-tolerant QCA full-adder cell is proposed: This component is simple in structure and suitable for designing fault-tolerant QCA circuits. The redundant version of QCA full-adder cell is powerful in terms of implementing robust digital functions. ...
متن کاملFault Diagnosis and Fault-Tolerant SVPWM Technique of Six-phase Converter under Open-Switch Fault
In this paper, a new open-switch fault diagnosis method is proposed for the six-phase AC-DC converter based on the difference between the phase current and the corresponding reference using an adaptive threshold. The open-switch faults are detected without any additional equipment and complicated calculations, since the proposed fault detection method is integrated with the controller required ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1512.00537 شماره
صفحات -
تاریخ انتشار 2015